Correlation Clustering in Data Streams

Abstract

Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as k-center, k-median, and k-means. Such algorithms need to be both time and space efficient. In this paper, we address the problem of correlation clustering in the dynamic data stream model. The stream consists of updates to the edge weights of a graph on n nodes and the goal is to find a node-partition such that the end-points of negative-weight edges are typically in different clusters whereas the end-points of positive-weight edges are typically in the same cluster. We present polynomial-time, $O(n \cdot \mathrm{polylog}\, n)$-space approximation algorithms for natural problems that arise. We first develop data structures based on linear sketches that allow the "quality" of a given node-partition to be measured. We then combine these data structures with convex programming and sampling techniques to solve the relevant approximation problem. Unfortunately, the standard LP and SDP formulations are not obviously solvable in $O(n \cdot \mathrm{polylog}\, n)$-space. Our work presents space-efficient algorithms for the convex programming required, as well as approaches to reduce the adaptivity of the sampling.
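To make the objective in the abstract concrete, the following is a minimal sketch of how the "quality" of a given node-partition could be scored on a graph defined by a dynamic stream of edge-weight updates. It is only an illustration of the problem statement, not the paper's method: it stores every edge explicitly, so it does not meet the $O(n \cdot \mathrm{polylog}\, n)$ space bound that the linear sketches are designed to achieve. The function names `apply_updates` and `disagreement_cost`, the update format `(u, v, delta)`, and the choice of total disagreement weight as the quality measure are all assumptions made for this example.

```python
from collections import defaultdict


def apply_updates(updates):
    """Accumulate a stream of (u, v, delta) edge-weight updates.

    Assumption: the graph is undirected, so (u, v) and (v, u) refer to
    the same edge. Edges whose net weight is zero are dropped.
    """
    weights = defaultdict(int)
    for u, v, delta in updates:
        edge = (min(u, v), max(u, v))
        weights[edge] += delta
    return {edge: w for edge, w in weights.items() if w != 0}


def disagreement_cost(weights, cluster_of):
    """Total weight of edges that disagree with the partition.

    A positive-weight edge disagrees when its end-points are in
    different clusters; a negative-weight edge disagrees when its
    end-points share a cluster.
    """
    cost = 0
    for (u, v), w in weights.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        if w > 0 and not same_cluster:
            cost += w
        elif w < 0 and same_cluster:
            cost += -w
    return cost


# Hypothetical stream: edge (2, 3) is inserted and later cancelled out.
updates = [
    (0, 1, 1),
    (1, 2, 1),
    (0, 2, -1),
    (2, 3, 1),
    (2, 3, -1),
    (0, 3, -1),
]
weights = apply_updates(updates)

# Candidate partition: nodes 0, 1, 2 together, node 3 alone.
partition = {0: "a", 1: "a", 2: "a", 3: "b"}
cost = disagreement_cost(weights, partition)
```

In this toy stream the only disagreement is the negative edge (0, 2), whose end-points share a cluster. A streaming algorithm in the sense of the abstract would have to estimate such a cost without keeping the full `weights` table.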

Similar resources

Correlation Clustering in Data Streams

In this paper, we address the problem of correlation clustering in the dynamic data stream model. The stream consists of updates to the edge weights of a graph on n nodes and the goal is to find a node-partition such that the end-points of negative-weight edges are typically in different clusters whereas the end-points of positive-weight edges are typically in the same cluster. We present polyn...

Clustering Data Streams

We study clustering under the data stream model of computation where: given a sequence of points, the objective is to maintain a consistently good clustering of the sequence observed so far, using a small amount of memory and time. The data stream model is relevant to new classes of applications involving massive data sets, such as web click stream analysis and multimedia data analysis. We g...

Clustering Geometric Data Streams

Using recent knowledge in data stream clustering we present a modified approach to the facility location problem in the context of geometric data streams. We give insight to the existing algorithm from a less mathematical point of view, focusing on understanding and practical use, namely by computer graphics experts. We propose a modification of the original data stream k-median clustering to s...

Clustering categorical data streams

The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams becomes more difficult, because the data objects in a data stream must be accessed in order and can be read only once or few times with limited resources. Rec...

Journal

Journal title: Algorithmica

Year: 2021

ISSN: 1432-0541, 0178-4617

DOI: https://doi.org/10.1007/s00453-021-00816-9